17 research outputs found
Transfer-Learned Potential Energy Surfaces: Towards Microsecond-Scale Molecular Dynamics Simulations in the Gas Phase at CCSD(T) Quality
The rise of machine learning has greatly influenced the field of
computational chemistry, and that of atomistic molecular dynamics simulations
in particular. One of its most exciting prospects is the development of
accurate, full-dimensional potential energy surfaces (PESs) for molecules and
clusters, which, however, often require thousands to tens of thousands of ab
initio data points, restricting the community to medium-sized molecules and/or
lower levels of theory (e.g. DFT). Transfer learning, which improves a global
PES from a lower to a higher level of theory, offers a data efficient
alternative requiring only a fraction of the high level data (on the order of
100 are found to be sufficient for malonaldehyde). The present work
demonstrates that even with Hartree-Fock theory and a double-zeta basis set as
the lower level model, transfer learning yields CCSD(T)-level quality for
H-transfer barrier energies, harmonic frequencies and H-transfer tunneling
splittings. Most importantly, finite-temperature molecular dynamics simulations
on the sub-microsecond time scale in the gas phase are possible and the
infrared spectra determined from the transfer learned PESs are in good
agreement with experiment. It is concluded that routine, long-time atomistic
simulations on PESs fulfilling CCSD(T)-standards become possible.
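The idea of improving a low-level PES with a handful of high-level points can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the 1D double-well potentials `v_low` and `v_high`, the polynomial model, and the use of ridge regression regularised toward the low-level weights as a stand-in for fine-tuning a neural network. It is not the method or the data of the paper.

```python
import numpy as np

def features(x, degree=6):
    """Polynomial features [1, x, ..., x^degree] for a 1D coordinate."""
    return np.vander(np.asarray(x, dtype=float), degree + 1, increasing=True)

def v_low(x):
    """Hypothetical low-level PES (toy double well)."""
    return x**4 - 2.0 * x**2

def v_high(x):
    """Hypothetical high-level PES (slightly different double well)."""
    return 1.1 * x**4 - 2.3 * x**2

def fit_low_level(x, y):
    """Least-squares fit of the model to many cheap low-level points."""
    w, *_ = np.linalg.lstsq(features(x), y, rcond=None)
    return w

def transfer_learn(x_hl, y_hl, w_ll, lam=1e-3):
    """Minimise ||X w - y_hl||^2 + lam * ||w - w_ll||^2 in closed form.

    The penalty keeps the weights close to the low-level solution, so a
    few high-level points only adjust the model where they carry information.
    """
    X = features(x_hl)
    A = X.T @ X + lam * np.eye(X.shape[1])
    b = X.T @ y_hl + lam * w_ll
    return np.linalg.solve(A, b)

def rmse(w, x, y):
    """Root-mean-square error of the model with weights w on (x, y)."""
    return float(np.sqrt(np.mean((features(x) @ w - y) ** 2)))

# Many low-level points, only five high-level points.
x_ll = np.linspace(-1.5, 1.5, 200)
w_ll = fit_low_level(x_ll, v_low(x_ll))
x_hl = np.array([-1.2, -0.6, 0.0, 0.7, 1.3])
w_tl = transfer_learn(x_hl, v_high(x_hl), w_ll)

x_test = np.linspace(-1.5, 1.5, 301)
print("low-level model vs high-level reference:", rmse(w_ll, x_test, v_high(x_test)))
print("transfer-learned model vs high-level reference:", rmse(w_tl, x_test, v_high(x_test)))
```

The regularisation strength `lam` plays the role that a small learning rate or early stopping would play when fine-tuning a network: larger values keep the model closer to the low-level surface.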
Isomerization and Decomposition Reactions of Acetaldehyde Relevant to Atmospheric Processes from Dynamics Simulations on Neural Network-Based Potential Energy Surfaces
Acetaldehyde (AA) isomerization (to vinylalcohol, VA) and decomposition (into
either CO+CH4 or H2+H2CCO) are studied using a fully dimensional,
reactive potential energy surface represented as a neural network (NN). The NN,
trained on 432'399 reference structures from MP2/aug-cc-pVTZ calculations has a
MAE of 0.0453 kcal/mol and an RMSE of 1.186 kcal/mol for a test set of 27'399
structures. For the isomerization process AA → VA the minimum
dynamical path implies that the C-H vibration, and the C-C-H (with H being the
transferring H-atom) and the C-C-O angles are involved to surmount the 68.2
kcal/mol barrier. Using an excess energy of 93.6 kcal/mol - the energy
available in the solar spectrum and sufficient to excite to the first
electronically excited state - to initialize the molecular dynamics, no
isomerization to VA is observed on the 500 ns time scale. Only with excess
energies of 127.6 kcal/mol (including the zero point energy of the AA
molecule) does isomerization occur on the nanosecond time scale. Given that
collisional de-excitation at atmospheric conditions in the stratosphere occurs
on the 100 ns time scale, it is concluded that formation of VA following
photoexcitation of AA from actinic photons is unlikely. This also limits the
relevance of this reaction pathway as a source of formic acid.
Reactive Dynamics and Spectroscopy of Hydrogen Transfer from Neural Network-Based Reactive Potential Energy Surfaces
The in silico exploration of chemical, physical and biological systems
requires accurate and efficient energy functions to follow their nuclear
dynamics at a molecular and atomistic level. Recently, machine learning tools
gained a lot of attention in the field of molecular sciences and simulations
and are increasingly used to investigate the dynamics of such systems. Among
the various approaches, artificial neural networks (NNs) are one promising tool
to learn a representation of potential energy surfaces. This is done by
formulating the problem as a mapping from a set of atomic positions
and nuclear charges to a potential energy.
Here, a fully-dimensional, reactive neural network representation for
malonaldehyde (MA), acetoacetaldehyde (AAA) and acetylacetone (AcAc) is
learned. It is used to run finite-temperature molecular dynamics simulations,
and to determine the infrared spectra and the hydrogen transfer rates for the
three molecules. The finite-temperature infrared spectrum for MA based on the
NN learned on MP2 reference data provides a realistic representation of the
low-frequency modes and the H-transfer band whereas the CH vibrations are
somewhat too high in frequency. For AAA it is demonstrated that the IR
spectroscopy is sensitive to the position of the transferring hydrogen at
either the OCH- or OCCH3 end of the molecule. For the hydrogen transfer
rates it is demonstrated that the O-O vibration is a gating mode and largely
determines the rate at which the hydrogen is transferred between the donor and
acceptor. Finally, possibilities to further improve such NN-based potential
energy surfaces are explored. They include the transferability of an NN-learned
energy function across chemical species (here methylation) and transfer
learning from a lower level of reference data (MP2) to a higher level of theory
(pair natural orbital-LCCSD(T)).
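Finite-temperature infrared spectra of the kind mentioned above are commonly estimated from the dipole moment time series of an MD trajectory. The sketch below is a generic illustration, not the authors' workflow: it uses the fact that the Fourier transform of the dipole autocorrelation function equals the power spectrum of the dipole signal, and it omits quantum correction factors and absolute intensity prefactors. The function name `ir_spectrum` and the synthetic test signal are assumptions.

```python
import numpy as np

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s

def ir_spectrum(dipole, dt_fs):
    """Return (wavenumber in cm^-1, unnormalised intensity).

    dipole: array of shape (n_steps,) or (n_steps, 3) from an MD run.
    dt_fs:  time between stored frames in femtoseconds.
    """
    dipole = np.asarray(dipole, dtype=float)
    if dipole.ndim == 1:
        dipole = dipole[:, None]
    # Remove the static dipole so only fluctuations contribute.
    fluct = dipole - dipole.mean(axis=0)
    # A Hann window reduces spectral leakage from the finite trajectory.
    window = np.hanning(len(fluct))[:, None]
    amplitude = np.fft.rfft(fluct * window, axis=0)
    # Power spectrum, summed over Cartesian components.
    intensity = (np.abs(amplitude) ** 2).sum(axis=1)
    freq_hz = np.fft.rfftfreq(len(fluct), d=dt_fs * 1e-15)
    return freq_hz / C_CM_PER_S, intensity

# Synthetic "trajectory": a single oscillator at 1600 cm^-1.
dt_fs = 0.5
t_s = np.arange(32768) * dt_fs * 1e-15
nu = 1600.0
mu = np.cos(2.0 * np.pi * nu * C_CM_PER_S * t_s)
wavenumber, intensity = ir_spectrum(mu, dt_fs)
print("peak position (cm^-1):", wavenumber[np.argmax(intensity)])
```

The spectral resolution is set by the trajectory length, which is one reason why long simulations on an inexpensive PES are useful for spectroscopy.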
Transfer learning for affordable and high quality tunneling splittings from instanton calculations
The combination of transfer learning (TL) of a low-level potential energy
surface (PES) to a higher level of electronic structure theory together with
ring-polymer instanton (RPI) theory is explored and applied to malonaldehyde.
The RPI approach provides a semiclassical approximation of the tunneling
splitting and depends sensitively on the accuracy of the PES. With second order
Møller-Plesset perturbation theory (MP2) as the low-level (LL) model and
energies and forces from coupled cluster singles, doubles and perturbative
triples (CCSD(T)) as the high-level (HL) model, it is demonstrated that CCSD(T)
information from only 25 to 50 judiciously selected structures along and around
the instanton path suffices to reach HL-accuracy for the tunneling splitting. In
addition, the global quality of the HL-PES is demonstrated through a mean
average error of 0.3 kcal/mol for energies up to 40 kcal/mol above the minimum
energy structure (a factor of 2 higher than the energies employed during TL)
and cm−1 for harmonic frequencies compared with computationally
challenging normal mode calculations at the CCSD(T) level.
Machine Learning for Observables: Reactant to Product State Distributions for Atom-Diatom Collisions
Machine learning-based models to predict product state distributions from a
distribution of reactant conditions for atom-diatom collisions are presented
and quantitatively tested. The models are based on function-, kernel- and
grid-based representations of the reactant and product state distributions.
While all three methods predict final state distributions from explicit
quasi-classical trajectory simulations with R² > 0.998, the grid-based
approach performs best. Although a function-based approach is found to be more
than two times better in computational performance, the kernel- and grid-based
approaches are preferred in terms of prediction accuracy, practicability and
generality. The function-based approach also suffers from lacking a general set
of model functions. Applications of the grid-based approach to nonequilibrium,
multi-temperature initial state distributions are presented, a situation common
to energy distributions in hypersonic flows. The role of such models in Direct
Simulation Monte Carlo and computational fluid dynamics simulations is also
discussed.
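Agreement between a predicted and a reference product state distribution evaluated on a common grid can be quantified with the coefficient of determination. The helper below is a minimal sketch with the standard definition, 1 minus the ratio of residual to total sum of squares. The function name and the example arrays are hypothetical, and the paper may define its metric differently.

```python
import numpy as np

def r_squared(reference, predicted):
    """Coefficient of determination between two distributions on the same grid."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((reference - predicted) ** 2)
    ss_tot = np.sum((reference - reference.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical grid-based distribution, e.g. P(E') from trajectories vs. a model.
reference = np.array([0.05, 0.20, 0.40, 0.25, 0.10])
predicted = np.array([0.06, 0.19, 0.41, 0.24, 0.10])
print("R^2 =", r_squared(reference, predicted))
```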
PhysNet Meets CHARMM: A Framework for Routine Machine Learning / Molecular Mechanics Simulations
Full dimensional potential energy surfaces (PESs) based on machine learning
(ML) techniques provide means for accurate and efficient molecular simulations
in the gas- and condensed-phase for various experimental observables ranging
from spectroscopy to reaction dynamics. Here, the MLpot extension with PhysNet
as the ML-based model for a PES is introduced into the newly developed pyCHARMM
API. To illustrate conceiving, validating, refining and using a typical
workflow, para-chloro-phenol is considered as an example. The main focus is on
how to approach a concrete problem from a practical perspective and
applications to spectroscopic observables and the free energy for the -OH
torsion in solution are discussed in detail. For the computed IR spectra in the
fingerprint region the computations for para-chloro-phenol in water are in good
qualitative agreement with experiment carried out in CCl4. Also, relative
intensities are largely consistent with experimental findings. The barrier for
rotation of the -OH group increases from kcal/mol in the gas phase
to kcal/mol from simulations in water due to favourable H-bonding
interactions of the -OH group with surrounding water molecules.
Comment: 38 pages, 11 figures
ML Models of Vibrating H2CO: Comparing Reproducing Kernels, FCHL and PhysNet
Machine Learning (ML) has become a promising tool for improving the quality
of atomistic simulations. Using formaldehyde as a benchmark system for
intramolecular interactions, a comparative assessment of ML models based on
state-of-the-art variants of deep neural networks (NN), reproducing kernel
Hilbert space (RKHS+F), and kernel ridge regression (KRR) is presented.
Learning curves for energies and atomic forces indicate rapid convergence
towards excellent predictions for B3LYP, MP2, and CCSD(T)-F12 reference results
for modestly sized (in the hundreds) training sets. Typically, learning curve
offsets decay as one goes from NN (PhysNet) to RKHS+F to KRR (FCHL).
Conversely, the predictive power for extrapolation of energies towards new
geometries increases in the same order with RKHS+F and FCHL performing almost
equally. For harmonic vibrational frequencies, the picture is less clear, with
PhysNet and FCHL yielding flat learning curves at 1 and 0.2 cm−1,
respectively, regardless of the reference method, while RKHS+F models level off for
B3LYP, and exhibit continued improvements for MP2 and CCSD(T)-F12.
Finite-temperature molecular dynamics (MD) simulations with the same initial
conditions yield indistinguishable infrared spectra with good performance
compared with experiment except for the high-frequency modes involving hydrogen
stretch motion which is a known limitation of MD for vibrational spectroscopy.
For sufficiently large training set sizes all three models can detect
insufficient convergence ("noise") of the reference electronic structure
calculations in that the learning curves level off. Transfer learning (TL) from
B3LYP to CCSD(T)-F12 with PhysNet indicates that additional improvements in
data efficiency can be achieved.
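A learning curve records the prediction error as a function of training set size. The sketch below shows the procedure for plain kernel ridge regression with a Gaussian kernel on a 1D Morse potential. It is a generic toy under stated assumptions: the Morse parameters, kernel width `sigma`, and regulariser `lam` are arbitrary choices, and the model is neither FCHL, RKHS+F, nor PhysNet.

```python
import numpy as np

def morse(r, d_e=1.0, a=1.5, r_e=1.2):
    """Toy reference energy: Morse potential in arbitrary units."""
    return d_e * (1.0 - np.exp(-a * (r - r_e))) ** 2

def gaussian_kernel(xa, xb, sigma=0.3):
    """Gaussian kernel matrix between two sets of 1D points."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / sigma) ** 2)

def krr_fit(x_train, y_train, lam=1e-8):
    """Solve (K + lam * I) alpha = y for the regression coefficients."""
    k = gaussian_kernel(x_train, x_train)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)

def krr_predict(x_train, alpha, x_new):
    """Predict energies at x_new from the fitted coefficients."""
    return gaussian_kernel(x_new, x_train) @ alpha

def learning_curve(sizes, x_test):
    """Mean absolute test error for each training set size."""
    y_test = morse(x_test)
    errors = []
    for n in sizes:
        x_train = np.linspace(0.8, 3.0, n)
        alpha = krr_fit(x_train, morse(x_train))
        y_pred = krr_predict(x_train, alpha, x_test)
        errors.append(float(np.mean(np.abs(y_pred - y_test))))
    return errors

sizes = [4, 8, 16, 32, 64]
x_test = np.linspace(0.85, 2.95, 101)
maes = learning_curve(sizes, x_test)
for n, err in zip(sizes, maes):
    print(f"N = {n:3d}  MAE = {err:.2e}")
```

Plotting the errors against the training set size on log-log axes gives the usual learning-curve representation, where offset and slope can be compared between models.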
Hydration dynamics and IR spectroscopy of 4-fluorophenol
Halogenated groups are relevant in pharmaceutical applications and are potentially useful spectroscopic probes for infrared spectroscopy. In this work, the structural dynamics and infrared spectroscopy of para-fluorophenol (F-PhOH) and phenol (PhOH) are investigated in the gas phase and in water using a combination of experiment and molecular dynamics (MD) simulations. The gas-phase and solvent dynamics around F-PhOH and PhOH are characterized from atomistic simulations using empirical energy functions with point charges or multipoles for the electrostatics, machine learning (ML)-based parametrizations, and full ab initio (QM) and mixed quantum mechanical/molecular mechanics (QM/MM) simulations, with a particular focus on the CF- and OH-stretch region. The CF-stretch band is heavily mixed with other modes, whereas the OH-stretch in solution displays a characteristic high-frequency peak around 3600 cm−1, most likely associated with the –OH group of PhOH and F-PhOH, together with a characteristic progression below 3000 cm−1 due to coupling with water modes, which is also reproduced by several of the simulations. Solvent and radial distribution functions indicate that the CF-site is largely hydrophobic, except in simulations using point charges, which renders them unsuited for correctly describing hydration and dynamics around fluorinated sites. The hydrophobic character of the CF-group is particularly relevant for applications in pharmaceutical chemistry with a focus on local hydration and interaction with the surrounding protein.